Perturbative formulation of general continuous-time Markov model of sequence evolution via insertions/deletions, Part IV: Incorporation of substitutions and other mutations

نویسندگان

  • Kiyoshi Ezawa
  • Dan Graur
  • Giddy Landan
چکیده

Background Insertions and deletions (indels) account for more nucleotide differences between two related DNA sequences than substitutions do, and thus it is imperative to develop a stochastic evolutionary model that enables us to reliably calculate the probability of the sequence evolution through indel processes. In a separate paper (Ezawa, Graur and Landan 2015a), we established the theoretical basis of our ab initio perturbative formulation of a continuous-time Markov model of the evolution of an entire sequence via insertions and deletions along time axis. In other separate papers (Ezawa, Graur and Landan 2015b,c), we also developed various analytical and computational methods to concretely calculate alignment probabilities via our formulation. In terms of frequencies, however, substitutions are usually more common than indels. Moreover, many experiments suggest that other mutations, such as genomic rearrangements and recombination, also play some important roles in sequence evolution. Results Here, we extend our ab initio perturbative formulation of a genuine evolutionary model so that it can incorporate other mutations. We give a sufficient set of conditions that the probability of evolution via both indels and substitutions is factorable into the product of an overall factor and local contributions. We also show that, under a set of conditions, the probability can be factorized into two sub-probabilities, one via indels alone and the other via substitutions alone. Moreover, we show that our formulation can be extended so that it can also incorporate genomic rearrangements, such as inversions and duplications. We also discuss how to accommodate some other types of mutations within our formulation. . CC-BY 4.0 International license not peer-reviewed) is the author/funder. It is made available under a The copyright holder for this preprint (which was . http://dx.doi.org/10.1101/023622 doi: bioRxiv preprint first posted online Jul. 31, 2015;

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Perturbative formulation of general continuous-time Markov model of sequence evolution via insertions/deletions, Part III: Algorithm for first approximation

Background Insertions and deletions (indels) account for more nucleotide differences between two related DNA sequences than substitutions do, and thus it is imperative to develop a stochastic evolutionary model that enables us to reliably calculate the probability of the sequence evolution through indel processes. In a separate paper (Ezawa, Graur and Landan 2015a), we established an ab initio ...

متن کامل

Investigation of Polymorphisms in Non-Coding Region of Human Mitochondrial DNA in 31 Iranian Hypertrophic Cardiomyopathy (HCM) Patients

The D-loop region is a hot spot for mitochondrial DNA (mtDNA) alterations, containing two hypervariable segments, HVS-I and HVS-II. In order to identify polymorphic sites and potential genetic background accounting for Hypertrophic CardioMyopathy (HCM) disease, the complete non-coding region of mtDNA from 31 unrelated HCM patients and 45 normal controls were sequenced. The sequences were aligne...

متن کامل

Protein insertions and deletions enabled by neutral roaming in sequence space.

Backbone modifications via insertions and deletions (InDels) may exert dramatic effects, for better (mediating new functions) and for worse (causing loss of structure and/or function). However, contrary to point mutations (substitutions), our knowledge of the evolution and structural-functional effects of InDels is limited and so is our capability to engineer them. We sought to assess how delet...

متن کامل

Neuraminidase gene sequence analysis of avian influenza H9N2 viruses isolated from Iran

Influenza A viruses possesses two virion surface glycoproteins including haemagglutinin (HA) and neuraminidase (NA). The NA plays an important role in viral replication and promotes virus release from infected cells and facilitates virus spread throughout the body. To find out any genomic changes that might be occurred on NA gene of avian influenza circulating viruses, we have genetically analy...

متن کامل

Probabilistic Phylogenetic Inference with Insertions and Deletions

A fundamental task in sequence analysis is to calculate the probability of a multiple alignment given a phylogenetic tree relating the sequences and an evolutionary model describing how sequences change over time. However, the most widely used phylogenetic models only account for residue substitution events. We describe a probabilistic model of a multiple sequence alignment that accounts for in...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015